library(dplyr)
library(Hmisc)
library(rms)
library(summarytools)
library(consort)
library(gtsummary)
library(gt)
library(ggplot2)
library(glue)
library(htmlwidgets)

This .Rmd file was written with the aim to aid you when writing your first reports in R Markdown. It contains the most important features for creating a good reproducible report. The structure followed is not necessarily the one you should follow when performing your data analysis or reporting your results. The purpose of this template is simply to present you with a good structure for the data report and the code behind it.

The structure used in this R Markdown report is the following:

  • Background - describe in a concise way the background of the research field, the research question of interest, etc
  • Data dictionary - makes it easier for the reader to know the meaning of each variable. Can select those variables of interest for analysis, no need to show the whole data set
  • Flowchart - Show how many patients are excluded from the analysis
  • Descriptive statistics - show univariable analysis for continuous and categorical variables so the audience has a good overlook of the data.
  • Table 1 - There are better ways of describing your data, but everybody wants to see a Table 1
  • Model - Visualize your model and your effects
  • Conclusion
  • Computer environment - always include the computer environment used. It fosters reproducibility
  • Update history - optional. To show changes over time in your report.

Through this report I have used different chunk options. Below a short summary of the ones I have used. For a detail explanation of these and many other chunk options, please check this.

  • echo = FALSE hide code (default is TRUE hence code is displayed) but result is shown
  • warning = FALSE drop warnings from output document (default is TRUE)
  • message = FALSE suppress messages (default is TRUE)
  • eval = FALSE do not evaluate the code (default is TRUE) but the code appears if echo=T was chosen
  • include = FALSE to hide both the code and the result from the rendered document but the code is still evaluated
  • fig.align = to align the figure. Four options exist default, left, right or center
  • fig.cap = write a figure caption
  • fig.dim = c(7,8) choose dimensions of plot. Is a shorthand of fig.width = 7, fig.height = 8.
  • out.width = 70% and out.height = 80% width and height of the plot in the output document, which can be different with its physical fig.width and fig.height, i.e., plots can be scaled in the output document.
  • dpi = dots per inch value (for image resolution). Default is 72

Note that this .Rmd file was written with the intention to knit to HTML. I believe that HTML files have more advantages than pdf or word files and are easier to present to your PI or colleagues.

How to insert references in your Report?

I personally prefer to use hyperlinks than references because it allows users to access the specific manuscript or book directly. However, a reference Section at the end of your Report might be of interest to have a quick overview of the bibliography used.

To do this you will need to create a plain-text file with the .bib extension that consists of bibliography entries. You can do this automatically with the knitr::write_bib() function. This function uses utils::citation() and utils::toBibtex() to create bib entries for R packages and write them in a file.

# automatically create a bib database for R packages 
knitr::write_bib(x = c('knitr', 'rmarkdown'), #specify package name(s).
                 file = 'packages.bib') # the .bib file to write

You can also use the .packages() argument. This will create a bib entry for all R packages loaded in the report, on top of the packages you specified manually.

knitr::write_bib(x = c(.packages(), 'knitr', 'rmarkdown'), # .packages() will create a bib entry for all R packages used in the report
                 file = 'packages.bib') # the .bib file to write

Items can be cited directly within the documentation using the syntax @key where key is the citation key in the first line of the entry. For example:

The main packages needed to create an R Markdown report are the rmarkdown package (Allaire et al. 2022) and the knitr package (Xie 2022). To read more about the rmarkdown package, read the R Markdown: the definitive guide book (Xie, Allaire, and Grolemund 2018) or the R Markdown Cookbook (Xie, Dervieux, and Riederer 2020).

The references will be automatically created at the end of your R Markdown report file. Importantly, to insert references in your R Markdown report you will need to specify a bibliography file using the bibliography metadata field in YAML.

You can also create a .bib file manually and insert references by hand. Simply use the utils::citation() function to see how to cite R and R packages.

citation(package = "base")
## 
## To cite R in publications use:
## 
##   R Core Team (2021). R: A language and environment for statistical
##   computing. R Foundation for Statistical Computing, Vienna, Austria.
##   URL https://www.R-project.org/.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {R: A Language and Environment for Statistical Computing},
##     author = {{R Core Team}},
##     organization = {R Foundation for Statistical Computing},
##     address = {Vienna, Austria},
##     year = {2021},
##     url = {https://www.R-project.org/},
##   }
## 
## We have invested a lot of time and effort in creating R, please cite it
## when using it for data analysis. See also 'citation("pkgname")' for
## citing R packages.

You would copy the output BibTeX entry into your .bib file.

Background

The Gusto-I trial was a randomized trial comparing 4 thrombolytic strategies for acute myocardial infarction. The original study can be found here.

knitr::include_graphics("./Figures/Gusto_NEJM.png") #need to download Figure from Repository (or simply suppress this chunk with eval=FALSE)
GUSTO-I trial published in the NEJM.

GUSTO-I trial published in the NEJM.

Our aim is to derive a model for predicting 30-day mortality. Because this is simply a template for showing how to use R Markdown, I will not go into much detail and will simply get to the stage of model derivation and showing partial effect plots. Model validation with a resampling technique (e.g. bootstrap internal validation) will not be shown. Important aspects like discrimination and calibration of the model, shrinkage of coefficients for protection of future overfitting or plotting decision curve analysis for assessing net benefit of the model will not be shown here. 1

Data

Data is obtained from http://hbiostat.org/data courtesy of the Vanderbilt University Department of Biostatistics.

load(url('http://hbiostat.org/data/gusto.rda'))
df <- as.data.frame(gusto)

Data dictionary

Hmisc::html(Hmisc::contents(df), levelType = 'table')

Data frame:df

40830 observations and 29 variables, maximum # NAs:10320  
NameLabelsUnitsLevelsClassStorageNAs
day30integer 0
shointeger 0
higinteger 0
diaDiabetesintegerinteger 0
hypinteger 0
hrtinteger 0
ttrTime to Relief of Chest Pain > 1hintegerinteger 0
sexSex2integer 0
KillipKillip Class4integer 0
agedouble 0
steNumber of ECG Leads with ST Elevationnumericinteger 0
pulseHeart Ratebeats/minintegerinteger 0
sysbpSystolic Blood PressuremmHgintegerinteger 0
antinteger 0
milocMI Location3integer 0
heightHeightcmnumericdouble 0
weightWeightkgnumericdouble 0
pmiPrevious MI2integer 0
htnHistory of Hypertensionintegerinteger 0
smkSmoking3integer 0
paninteger 0
famFamily History of MIintegerinteger 0
prevcvdinteger 0
prevcabginteger 0
reglinteger 0
grplinteger 0
grpsinteger 0
tpainteger10320
txTx in 3 groups3integer 0

VariableLevels
sexmale
female
KillipI
II
III
IV
milocInferior
Other
Anterior
pmino
yes
smknever
quit
current
txSK+tPA
SK
tPA

Patient Flowchart

We want to keep patients who received either streptokinase (SK) or primary angioplasty (tPA).

# dpi (dots per inch) for bitmap devices (dpi * inches = pixels). Default is 72
docl <- list()
docl[['all']] <- nrow(df)
docl[['sk_tPA']] <- table(df$tx)[1]
df <- df %>% filter(tx %in% c('SK', 'tPA'))
df$tx <- droplevels(df$tx) #tx was a three level categorical variable. Drop unused level (SK+tPA)
docl[['analysis']] <- nrow(df)

txt1 <- glue::glue("Patients recruited in \n the GUSTO-I trial \n (n={docl$all})")
txt1_side <- glue::glue("Excluded (n={docl$sk_tPA}):\n\u2022 Received SK+tPA (n={docl$sk_tPA})")
txt2 <- glue::glue("Main analysis \n (n={docl$analysis})")

flowchart <- consort::add_box(txt = txt1) %>% 
  consort::add_side_box(txt = txt1_side, dist = 0.05) %>%   
  consort::add_box(txt = txt2, dist = 0.05) 

plot(flowchart)
Patient flowchart using the {consort} package.

Patient flowchart using the {consort} package.

Descriptive statistics

Summary using summarytools

print(summarytools::dfSummary(df, 
                              varnumbers = FALSE,
                              valid.col = FALSE,
                              labels.col = FALSE,
                              graph.magnif = 0.76,
                              headings = F), 
      max.tbl.height = 300,
      method = "render")
Variable Stats / Values Freqs (% of Valid) Graph Missing
day30 [integer]
Min : 0
Mean : 0.1
Max : 1
0:28382(93.0%)
1:2128(7.0%)
0 (0.0%)
sho [integer]
Min : 0
Mean : 0
Max : 1
0:29864(97.9%)
1:646(2.1%)
0 (0.0%)
hig [integer]
Min : 0
Mean : 0.5
Max : 1
0:15603(51.1%)
1:14907(48.9%)
0 (0.0%)
dia [labelled, integer]
Min : 0
Mean : 0.1
Max : 1
0:25967(85.1%)
1:4543(14.9%)
0 (0.0%)
hyp [integer]
Min : 0
Mean : 0.1
Max : 1
0:28004(91.8%)
1:2506(8.2%)
0 (0.0%)
hrt [integer]
Min : 0
Mean : 0.3
Max : 1
0:20551(67.4%)
1:9959(32.6%)
0 (0.0%)
ttr [labelled, integer]
Min : 0
Mean : 0.6
Max : 1
0:10687(35.0%)
1:19823(65.0%)
0 (0.0%)
sex [labelled, factor]
1. male
2. female
22795(74.7%)
7715(25.3%)
0 (0.0%)
Killip [labelled, factor]
1. I
2. II
3. III
4. IV
26007(85.2%)
3857(12.6%)
417(1.4%)
229(0.8%)
0 (0.0%)
age [numeric]
Mean (sd) : 60.9 (11.9)
min ≤ med ≤ max:
19 ≤ 61.6 ≤ 110
IQR (CV) : 17.7 (0.2)
5342 distinct values 0 (0.0%)
ste [labelled, integer]
Mean (sd) : 4.1 (1.9)
min ≤ med ≤ max:
0 ≤ 4 ≤ 11
IQR (CV) : 3 (0.5)
12 distinct values 0 (0.0%)
pulse [labelled, integer]
Mean (sd) : 75.4 (17.8)
min ≤ med ≤ max:
0 ≤ 73 ≤ 220
IQR (CV) : 24 (0.2)
157 distinct values 0 (0.0%)
sysbp [labelled, integer]
Mean (sd) : 129 (23.9)
min ≤ med ≤ max:
0 ≤ 129.5 ≤ 280
IQR (CV) : 32 (0.2)
196 distinct values 0 (0.0%)
ant [integer]
Min : 0
Mean : 0.4
Max : 1
0:18644(61.1%)
1:11866(38.9%)
0 (0.0%)
miloc [labelled, factor]
1. Inferior
2. Other
3. Anterior
17582(57.6%)
1062(3.5%)
11866(38.9%)
0 (0.0%)
height [labelled, numeric]
Mean (sd) : 171 (9.4)
min ≤ med ≤ max:
140 ≤ 172 ≤ 212.5
IQR (CV) : 12.8 (0.1)
506 distinct values 0 (0.0%)
weight [labelled, numeric]
Mean (sd) : 79.4 (15.8)
min ≤ med ≤ max:
36.7 ≤ 78 ≤ 213
IQR (CV) : 18.5 (0.2)
644 distinct values 0 (0.0%)
pmi [labelled, factor]
1. no
2. yes
25452(83.4%)
5058(16.6%)
0 (0.0%)
htn [labelled, integer]
Min : 1
Mean : 1.4
Max : 2
1:18875(61.9%)
2:11635(38.1%)
0 (0.0%)
smk [labelled, factor]
1. never
2. quit
3. current
13089(42.9%)
8371(27.4%)
9050(29.7%)
0 (0.0%)
pan [integer]
Min : 1
Mean : 1.4
Max : 2
1:19165(62.8%)
2:11345(37.2%)
0 (0.0%)
fam [labelled, integer]
Min : 1
Mean : 1.4
Max : 2
1:17531(57.5%)
2:12979(42.5%)
0 (0.0%)
prevcvd [integer]
Min : 1
Mean : 1
Max : 2
1:29885(98.0%)
2:625(2.0%)
0 (0.0%)
prevcabg [integer]
Min : 1
Mean : 1
Max : 2
1:29145(95.5%)
2:1365(4.5%)
0 (0.0%)
regl [integer]
Mean (sd) : 8.6 (4.5)
min ≤ med ≤ max:
1 ≤ 9 ≤ 16
IQR (CV) : 7 (0.5)
16 distinct values 0 (0.0%)
grpl [integer]
Mean (sd) : 25.3 (14)
min ≤ med ≤ max:
1 ≤ 26 ≤ 48
IQR (CV) : 24 (0.6)
48 distinct values 0 (0.0%)
grps [integer]
Mean (sd) : 61.6 (34.8)
min ≤ med ≤ max:
1 ≤ 62 ≤ 121
IQR (CV) : 59 (0.6)
121 distinct values 0 (0.0%)
tpa [integer]
Min : 0
Mean : 0.3
Max : 1
0:20162(66.1%)
1:10348(33.9%)
0 (0.0%)
tx [factor]
1. SK
2. tPA
20162(66.1%)
10348(33.9%)
0 (0.0%)

Generated by summarytools 1.0.1 (R version 4.1.2)
2022-12-04

Summary using Hmisc

d <- Hmisc::describe(df)
Hmisc::html(d) #use scroll = TRUE for scrolling the output table
df

29 Variables   30510 Observations

day30
nmissingdistinctInfoSumMeanGmd
30510020.19521280.069750.1298

sho
nmissingdistinctInfoSumMeanGmd
30510020.0626460.021170.04145

hig
nmissingdistinctInfoSumMeanGmd
30510020.75149070.48860.4998

dia: Diabetes
nmissingdistinctInfoSumMeanGmd
30510020.3845430.14890.2535

hyp
nmissingdistinctInfoSumMeanGmd
30510020.22625060.082140.1508

hrt
nmissingdistinctInfoSumMeanGmd
30510020.6699590.32640.4398

ttr: Time to Relief of Chest Pain > 1h
nmissingdistinctInfoSumMeanGmd
30510020.683198230.64970.4552

sex: Sex
nmissingdistinct
3051002
 Value        male female
 Frequency   22795   7715
 Proportion  0.747  0.253
 

Killip: Killip Class
image
nmissingdistinct
3051004
 Value          I    II   III    IV
 Frequency  26007  3857   417   229
 Proportion 0.852 0.126 0.014 0.008
 

age
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
3051005342160.9113.5840.9244.7352.1161.5869.8476.1979.42
lowest : 19.027 20.781 20.969 20.984 21.449 , highest: 91.938 92.328 96.547 108.000 110.000
ste: Number of ECG Leads with ST Elevation
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
305100120.9614.0932.111234677
lowest : 0 1 2 3 4 , highest: 7 8 9 10 11
 Value          0     1     2     3     4     5     6     7     8     9    10    11
 Frequency    465  1309  3437  9493  4333  3823  3413  2902  1059   247    26     3
 Proportion 0.015 0.043 0.113 0.311 0.142 0.125 0.112 0.095 0.035 0.008 0.001 0.000
 

pulse: Heart Rate beats/min
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
3051001570.99975.3819.5 50 55 62 73 86 98107
lowest : 0 1 6 9 20 , highest: 191 200 205 210 220
sysbp: Systolic Blood Pressure mmHg
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
3051001960.99912926.58 92.0100.0112.0129.5144.0160.0170.0
lowest : 0 36 40 43 46 , highest: 266 274 275 276 280
ant
nmissingdistinctInfoSumMeanGmd
30510020.713118660.38890.4753

miloc: MI Location
image
nmissingdistinct
3051003
 Value      Inferior    Other Anterior
 Frequency     17582     1062    11866
 Proportion    0.576    0.035    0.389
 

height: Height cm
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
3051005060.99917110.61154.9157.5165.0172.0177.8182.8185.4
lowest : 140.0 140.8 140.9 141.0 141.3 , highest: 207.0 208.0 208.2 210.8 212.5
weight: Weight kg
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
305100644179.4317.29 56.00 60.39 69.50 78.00 88.00100.00107.00
lowest : 36.7 37.0 38.0 38.5 38.6 , highest: 182.0 184.0 185.0 188.0 213.0
pmi: Previous MI
nmissingdistinct
3051002
 Value         no   yes
 Frequency  25452  5058
 Proportion 0.834 0.166
 

htn: History of Hypertension
nmissingdistinctInfoMeanGmd
30510020.7081.3810.4719
 Value          1     2
 Frequency  18875 11635
 Proportion 0.619 0.381
 

smk: Smoking
image
nmissingdistinct
3051003
 Value        never    quit current
 Frequency    13089    8371    9050
 Proportion   0.429   0.274   0.297
 

pan
nmissingdistinctInfoMeanGmd
30510020.7011.3720.4672
 Value          1     2
 Frequency  19165 11345
 Proportion 0.628 0.372
 

fam: Family History of MI
nmissingdistinctInfoMeanGmd
30510020.7331.4250.4889
 Value          1     2
 Frequency  17531 12979
 Proportion 0.575 0.425
 

prevcvd
nmissingdistinctInfoMeanGmd
30510020.061.020.04013
 Value          1     2
 Frequency  29885   625
 Proportion  0.98  0.02
 

prevcabg
nmissingdistinctInfoMeanGmd
30510020.1281.0450.08548
 Value          1     2
 Frequency  29145  1365
 Proportion 0.955 0.045
 

regl
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
305100160.9958.65.125 1 2 5 9121415
lowest : 1 2 3 4 5 , highest: 12 13 14 15 16
 Value          1     2     3     4     5     6     7     8     9    10    11    12
 Frequency   1634  2213  1526  2150  1427  1192  2342  2191  2326  1277  1856  3241
 Proportion 0.054 0.073 0.050 0.070 0.047 0.039 0.077 0.072 0.076 0.042 0.061 0.106
                                   
 Value         13    14    15    16
 Frequency   1721  2571  1921   922
 Proportion 0.056 0.084 0.063 0.030
 

grpl
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
305100480.99925.316.11 2 51426384446
lowest : 1 2 3 4 5 , highest: 44 45 46 47 48
grps
image
nmissingdistinctInfoMeanGmd.05.10.25.50.75.90.95
305100121161.5640.21 6 12 32 62 91110115
lowest : 1 2 3 4 5 , highest: 117 118 119 120 121
tpa
nmissingdistinctInfoSumMeanGmd
30510020.672103480.33920.4483

tx
nmissingdistinct
3051002
 Value         SK   tPA
 Frequency  20162 10348
 Proportion 0.661 0.339
 

Continuous variables

The nice thing of .html format is the possibility of using interactive plots. The plotly package allows you to render interactive plots. Simply place your cursor on top of the figure and all the information for a specific point will be shown.

p <- plot(d)
plotly::subplot(p[1])

Categorical variables

The nice thing of .html format is the possibility of using interactive plots. The plotly package allows you to render interactive plots. Simply place your cursor on top of the figure and all the information for a specific point will be shown.

plotly::subplot(p[2])

Missings

There are no missings.

dd <- rms::datadist(df); options(datadist = "dd")
na.patterns <- Hmisc::naclus(df)
Hmisc::naplot(na.patterns, 'na per var')
Fractions of NAs per Variable.

Fractions of NAs per Variable.

Table 1

You can create a “Table1” directly in your .Rmd report, like done below.

df_table1 <- df %>% 
  select(
    tx,
    age,
    sex,
    Killip,
    pulse,
    sysbp,
    htn,
    dia,
    smk,
    pmi,
    ste,
    day30
    ) %>% 
  mutate(htn = ifelse(htn == 2, 1, 0))

tbl_gtsummary_custom <- 
  gtsummary::tbl_summary(
    df_table1,
    label = list(pulse ~ "Heart rate", 
                 sysbp ~ "Systolic Blood Pressure",
                 htn ~ "History of Hypertension",
                 dia ~ "Diabetes",
                 smk ~ "Smoking", 
                 pmi ~ "Previous MI",
                 ste ~ "Nbr ECG leads with ST-elevation",
                 day30 ~ "Dead in the first 30-days"), 
    by = tx, # split table by treatment received
    missing = "no" # don't list missing data separately
  ) %>%
  gtsummary::add_overall() %>% # add column with overall summary statistics
  gtsummary::add_n() %>% # add column with total number of non-missing observations
  gtsummary::modify_header(label = "**Variable**") # update the column header

#Add row headers
tbl_gtsummary_custom <- tbl_gtsummary_custom %>% 
  gtsummary::as_gt() %>% #convert to gt object and use gt functions
  gt::tab_row_group(label = "Outcome", rows = 20) %>% 
  gt::tab_row_group(label = "ECG findings", rows = 19) %>% 
  gt::tab_row_group(label = "CVRF, n (%)", rows = 12:18) %>% 
  gt::tab_row_group(label = "Vital signs", rows = 5:11) %>% 
  gt::tab_row_group(label = "Demographics", rows = 1:4) %>% 
  gt::tab_style(
    style = list(
      cell_fill(color = "#EFEFEF"),
      cell_text(weight = "bold")
    ),
    locations = cells_row_groups()
  )

tbl_gtsummary_custom
Variable N Overall, N = 30,5101 SK, N = 20,1621 tPA, N = 10,3481
Demographics
age 30,510 62 (52, 70) 62 (52, 70) 62 (52, 70)
Sex 30,510
    male 22,795 (75%) 15,070 (75%) 7,725 (75%)
    female 7,715 (25%) 5,092 (25%) 2,623 (25%)
Vital signs
Killip Class 30,510
    I 26,007 (85%) 17,209 (85%) 8,798 (85%)
    II 3,857 (13%) 2,529 (13%) 1,328 (13%)
    III 417 (1.4%) 275 (1.4%) 142 (1.4%)
    IV 229 (0.8%) 149 (0.7%) 80 (0.8%)
Heart rate 30,510 73 (62, 86) 73 (62, 86) 73 (62, 86)
Systolic Blood Pressure 30,510 130 (112, 144) 130 (112, 144) 130 (112, 144)
CVRF, n (%)
History of Hypertension 30,510 11,635 (38%) 7,679 (38%) 3,956 (38%)
Diabetes 30,510 4,543 (15%) 3,040 (15%) 1,503 (15%)
Smoking 30,510
    never 13,089 (43%) 8,654 (43%) 4,435 (43%)
    quit 8,371 (27%) 5,542 (27%) 2,829 (27%)
    current 9,050 (30%) 5,966 (30%) 3,084 (30%)
Previous MI 30,510 5,058 (17%) 3,320 (16%) 1,738 (17%)
ECG findings
Nbr ECG leads with ST-elevation 30,510 4 (3, 6) 4 (3, 6) 4 (3, 5)
Outcome
Dead in the first 30-days 30,510 2,128 (7.0%) 1,475 (7.3%) 653 (6.3%)
1 Median (IQR); n (%)

However, you can also insert previously created html tables in your report. This is very handy because you do not need to write two times the same code. Usually you will have your full syntax code written in a .R syntax file and you will later create an .Rmd report for reporting main results, tables, plots, etc. If this is the case, you can save the table in html format from within your .R syntax file and insert it into your .Rmd file with the htmltools package.

rawHTML_Table1 <- paste(readLines(glue::glue("./Tables/Table1.html")), collapse = "\n")
htmltools::HTML(rawHTML_Table1)

Model

We are going to fit a logistic regression model. We want to obtain predicted probabilities with the model. Probabilities are obtained with the equation below.

\[ Probability = \frac{1}{1 + e^{(-(lp))}} \text{, where lp = } \alpha + x_1\beta_1 + x_2\beta_2 + x_n\beta_n \]

Yes, you can insert equations in R Markdown with LaTeX notation. Take a look at different LaTeX symbols.

options(prType = 'html') #set print type option to html. Default is 'plain'
model <- rms::lrm(day30 ~ rcs(age, 4)*sex + tx + Killip + rcs(pulse, 4) +
               rcs(sysbp, 4) + miloc + pmi, x = T, y = T, data = df)
print(model, digits = 4, table = TRUE, conf.int = 0.95, 
      coefs = TRUE, title = 'Logistic regression model')
Logistic regression model
 rms::lrm(formula = day30 ~ rcs(age, 4) * sex + tx + Killip + 
     rcs(pulse, 4) + rcs(sysbp, 4) + miloc + pmi, data = df, x = T, 
     y = T)
 
Model Likelihood
Ratio Test
Discrimination
Indexes
Rank Discrim.
Indexes
Obs 30510 LR χ2 3071.49 R2 0.241 C 0.820
0 28382 d.f. 20 g 1.319 Dxy 0.640
1 2128 Pr(>χ2) <0.0001 gr 3.740 γ 0.641
max |∂log L/∂β| 2×10-10 gp 0.081 τa 0.083
Brier 0.055
β S.E. Wald Z Pr(>|Z|)
Intercept  -1.0932  0.8338 -1.31 0.1898
age   0.0281  0.0147 1.91 0.0564
age'   0.0664  0.0326 2.03 0.0421
age''  -0.1643  0.1202 -1.37 0.1716
sex=female  -0.0077  1.7377 0.00 0.9965
tx=tPA  -0.2094  0.0530 -3.95 <0.0001
Killip=II   0.6094  0.0592 10.29 <0.0001
Killip=III   1.1516  0.1215 9.48 <0.0001
Killip=IV   1.8972  0.1635 11.60 <0.0001
pulse  -0.0070  0.0065 -1.08 0.2816
pulse'   0.0868  0.0234 3.71 0.0002
pulse''  -0.2355  0.0673 -3.50 0.0005
sysbp  -0.0409  0.0029 -14.34 <0.0001
sysbp'   0.0580  0.0099 5.86 <0.0001
sysbp''  -0.1467  0.0382 -3.84 0.0001
miloc=Other   0.2985  0.1352 2.21 0.0273
miloc=Anterior   0.5623  0.0516 10.90 <0.0001
pmi=yes   0.4850  0.0568 8.54 <0.0001
age × sex=female  -0.0017  0.0349 -0.05 0.9609
age' × sex=female   0.0654  0.0697 0.94 0.3481
age'' × sex=female  -0.3171  0.2404 -1.32 0.1871

Visualize Model

Suppose we would like to show different ways of visualizing/interpreting our model. We could show each Figure one below the other, but there is a more elegant way. Tabsets! Tabsets are very useful when wanting to report more than 1 Figure or Table.

Nomogram

plot(nomogram(model))
Nomogram.

Nomogram.

Partial (conditional) effect plots

ggplot(rms::Predict(model, fun = plogis), rdata = df)
Partial (conditional) effect plots. Probability scale instead of default log odds scale.

Partial (conditional) effect plots. Probability scale instead of default log odds scale.

More

sbp <- c(60, 80, 100, 120)
ggplot(rms::Predict(model, age, sysbp = sbp, fun = plogis)) 

and more

ggplot(rms::Predict(model, age, sysbp = sbp, Killip, fun = plogis))

Nested tabset

We can embedd one tabset in another.

heatmap

pred_intx <- rms::Predict(model, age, sysbp, Killip, fun = plogis, np = 100) 
col <- base::rev(grDevices::heat.colors(999, alpha = 0.8))
bounds <- rms::perimeter(df$age, df$sysbp, lowess. = T) #to see where data really exists

rms::bplot(
  pred_intx,
  perim = bounds,
  ylab = "sysbp",
  col.regions = col
  )
Heatmap showing predicted probabilities at different ages and systolic blood pressure for the 4 Killip classes.

Heatmap showing predicted probabilities at different ages and systolic blood pressure for the 4 Killip classes.

Wireplot 3D

rms::bplot(
  pred_intx,
  lfun = wireframe,
  perim = bounds,
  ylab = "sysbp",
  zlab = "Pr(30-day Death)",
  shade = T
  )
3D plot

3D plot

Conclusion

Hope this template will aid you when creating a .html report file with R Markdown. Note that since RStudio version version 1.4 or higher, R Markdown has incorporated a visual editing mode, making all aspects of creating and rendering a report very easy.

Computing Environment

sessioninfo::session_info(pkgs = c("attached"), include_base = T)
## ─ Session info ───────────────────────────────────────────────────────────────
##  setting  value
##  version  R version 4.1.2 (2021-11-01)
##  os       macOS Catalina 10.15.5
##  system   x86_64, darwin17.0
##  ui       X11
##  language (EN)
##  collate  en_US.UTF-8
##  ctype    en_US.UTF-8
##  tz       Europe/Zurich
##  date     2022-12-04
##  pandoc   2.18 @ /Applications/RStudio.app/Contents/MacOS/quarto/bin/tools/ (via rmarkdown)
## 
## ─ Packages ───────────────────────────────────────────────────────────────────
##  ! package      * version date (UTC) lib source
##    base         * 4.1.2   2021-11-01 [?] local
##    consort      * 1.0.1   2021-12-20 [1] CRAN (R 4.1.0)
##  P datasets     * 4.1.2   2021-11-01 [1] local
##    dplyr        * 1.0.9   2022-04-28 [1] CRAN (R 4.1.2)
##    Formula      * 1.2-4   2020-10-16 [1] CRAN (R 4.1.0)
##    ggplot2      * 3.3.5   2021-06-25 [1] CRAN (R 4.1.0)
##    glue         * 1.6.2   2022-02-24 [1] CRAN (R 4.1.2)
##  P graphics     * 4.1.2   2021-11-01 [1] local
##  P grDevices    * 4.1.2   2021-11-01 [1] local
##    gt           * 0.7.0   2022-08-25 [1] CRAN (R 4.1.2)
##    gtsummary    * 1.6.2   2022-09-30 [1] CRAN (R 4.1.2)
##    Hmisc        * 4.6-0   2021-10-07 [1] CRAN (R 4.1.0)
##    htmlwidgets  * 1.5.4   2021-09-08 [1] CRAN (R 4.1.0)
##    lattice      * 0.20-45 2021-09-22 [1] CRAN (R 4.1.2)
##  P methods      * 4.1.2   2021-11-01 [1] local
##    rms          * 6.2-0   2021-03-18 [1] CRAN (R 4.1.0)
##    SparseM      * 1.81    2021-02-18 [1] CRAN (R 4.1.0)
##  P stats        * 4.1.2   2021-11-01 [1] local
##    summarytools * 1.0.1   2022-05-20 [1] CRAN (R 4.1.2)
##    survival     * 3.4-0   2022-08-09 [1] CRAN (R 4.1.2)
##  P utils        * 4.1.2   2021-11-01 [1] local
## 
##  [1] /Library/Frameworks/R.framework/Versions/4.1/Resources/library
## 
##  P ── Loaded and on-disk path mismatch.
## 
## ──────────────────────────────────────────────────────────────────────────────

Update History

You can create an update history section to show changes in your report over time.(e.g. someone asks for a new analysis, changes, etc).

Click to expand
Date Changes
2022-12-04 Added example of how to insert references
2022-12-04 Added example of how to create a nested tabset
2022-12-03 Published

References

Allaire, JJ, Yihui Xie, Jonathan McPherson, Javier Luraschi, Kevin Ushey, Aron Atkins, Hadley Wickham, Joe Cheng, Winston Chang, and Richard Iannone. 2022. Rmarkdown: Dynamic Documents for r. https://CRAN.R-project.org/package=rmarkdown.
Xie, Yihui. 2022. Knitr: A General-Purpose Package for Dynamic Report Generation in r. https://yihui.org/knitr/.
Xie, Yihui, J. J. Allaire, and Garrett Grolemund. 2018. R Markdown: The Definitive Guide. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown.
Xie, Yihui, Christophe Dervieux, and Emily Riederer. 2020. R Markdown Cookbook. Boca Raton, Florida: Chapman; Hall/CRC. https://bookdown.org/yihui/rmarkdown-cookbook.

  1. To learn prediction modelling the go-to source is Frank Harrell https://hbiostat.org/rmsc/ or Ewout Steyerberg https://link.springer.com/book/10.1007/978-0-387-77244-8↩︎